Abstract

The social role of a participant in a social system is a label conceptualizing the circumstances under which
she interacts within it. Social roles may be used as a theoretical tool that explains why and how users participate in
an online social system. Social role analysis also serves practical purposes, such as reducing the structure of 
complex systems to relationships among roles rather than alters, and enabling a comparison of social systems that 
emerge in similar contexts. This article presents a data-driven approach for the discovery of social roles in large 
scale social systems. Motivated by an analysis of the present art, the method discovers roles by the conditional
triad censuses of user ego-networks, which are a promising tool because they capture the degree to which basic
social forces push upon a user to interact with others. Clusters of censuses, inferred from samples of large scale
networks carefully chosen to preserve local structural properties, define the social roles. The promise of the method
is demonstrated by discussing and discovering the roles that emerge in both Facebook and Wikipedia. The article 
concludes with a discussion of the challenges and future opportunities in the discovery of social roles in large 
social systems. 

1 Introduction and Motivation 

Why do people choose to participate and interact with others in a social system? This basic question lies at 
the heart of many sociological studies that examine the nature of interactions in a community. The question is 
theoretically associated with the social roles of community members, which are defined as qualitative descriptions
capturing the circumstances and reasons under which they choose to interact with others. The concept of a social
role is fundamentally based on the notion of a user’s position within a social network [?]. For example,
with whom and how one decides to connect to others in a community is associated with how they are perceived
by others [?], the power they hold [22], and their ability to spread information and influence others [98]. As a
concrete illustration, consider a network of interactions among workers in a corporate office. Some workers have
the social role “manager” as defined by who they are connected to socially: “managers” are responsible for the
work of the “team members” they lead and report to an “executive”. Any person in the corporate office network in
a similar position, even if they report to a different executive and manage a different team, is still perceived to
have the social role “manager”.

Extracting and understanding the social roles of a social system carries theoretical and practical importance. 
Theoretically, an analyst may integrate the social roles discovered in a social setting and the context of these 
interactions to formulate a thesis about the reasons why and how people interact within the system. For example, 
consider the typical interactions that may occur within a generic corporate office as well as the connotations 
of being labeled a “manager”. Analysts could infer that “managers” interact with “team members” based on 
the initiatives and projects assigned to them by “executives”. They may be required to balance the demands 
placed on them by executives along with the needs of the team, and serve as a broker that filters information from 
corporate leaders to others in the organization. Practically, the delineation of users by their social role facilitates the
interpretation of complex social systems by simplifying their structure from connections among users to connections between
roles [8] [78] [95]. It also enables meaningful studies of communities across time and context (e.g., different types
of corporate offices) by comparing the structure of interactions between roles that are common among them. For
example, meta-analysis of the social roles and the interactions among them across different groups can
help designers create effective physical and digital spaces for communities and organizations to grow within [44].
Social role analysis is also useful to identify the types of users that may become influential [?], and even reveal
latent social structures within the systems [58].

This article presents a new method to discover the social roles that exist in large scale online social systems. 
The methodology is motivated by an analysis of the present art, which either: (i) requires an analyst to presume 
the existence of roles beforehand; and/or (ii) mines the roles using features about the users and the structure of 
the system that may not have a basis in social theory. The approach discovers social roles by clustering users by 
their conditional triad census, which is a vector capturing the types and orientations of three way relationships 
their ego-network is composed of. The method is applied to a network of interactions from an online social 
network (Facebook) and a collaborative editing platform (Wikipedia). An analysis of the quality of the resulting 
clusters and the ego-network structure of prototypical users demonstrates the utility of the proposed method. The
article concludes with a discussion about the many opportunities and challenges for future research in social role 
discovery for large scale social systems. 

This article is organized as follows: Section 2 reviews and assesses existing methods for social role discovery
in large scale social systems. Section 3 introduces the concept of a conditional triad census and the proposed
methodology. The section that follows analyzes the structure of the social roles mined from two large scale online
social systems. Important challenges and opportunities that remain in the analysis of social roles in large scale
systems are then presented, and concluding remarks are offered in the final section.

2 Discovering Social Roles 

Present methods to discover social roles in social systems may be classified into three types: (i) methods that 
define roles by notions of equivalence; (ii) methods that require the assertion of the roles existing in the system prior 
to analysis; and (iii) methods that define roles based on patterns among user attributes and system interactions. This 
section provides an overview of each type and their applicability to discover social roles in large scale systems. 

2.1 Equivalence based role discovery 

Longstanding methods to identify social roles are based on finding users who are in “equivalent” positions [95]
[?], which may be defined in one of three ways. Given an undirected network $G = (V, E)$ of users $V$
connected by a set of relations $E$, structural equivalence requires two users $i$ and $j$ to be connected to exactly
the same set of others. In other words, for every relationship $(i, x) \in E$ that exists, the relation $(j, x)$ must also
exist, and vice versa. Under this definition, a user’s social role is precisely defined by the people that she is connected to. This
strict definition may not be useful in many settings because it is impossible for two users whose distance is greater
than two in a network to fall under the same role. For example, two “managers” in an office that report to a
common “executive” but have different sets of subordinates are not structurally equivalent and would therefore
not be classified under the same role.
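
The structural equivalence test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the office network, the user labels, and the helper names `neighbor_sets` and `structurally_equivalent` are hypothetical, and a tie between the two users being compared is ignored, a convention that differs across the literature.

```python
from collections import defaultdict


def neighbor_sets(edges):
    """Map each user to the set of users she is connected to (undirected ties)."""
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return nbrs


def structurally_equivalent(nbrs, i, j):
    """True if users i and j are tied to exactly the same set of others.

    Assumption: a direct tie between i and j is ignored in the comparison.
    """
    return nbrs[i] - {j} == nbrs[j] - {i}


# Hypothetical office: two managers report to one executive but lead different teams.
office = [("exec", "m1"), ("exec", "m2"),
          ("m1", "a"), ("m1", "b"),
          ("m2", "c"), ("m2", "d")]
nbrs = neighbor_sets(office)

managers_equivalent = structurally_equivalent(nbrs, "m1", "m2")
teammates_equivalent = structurally_equivalent(nbrs, "a", "b")
```

In this toy network the two managers fail the test because their teams differ, while team members "a" and "b" pass because both are tied only to "m1", which mirrors the limitation discussed above.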

Isomorphic equivalence offers a broader definition of equivalent network positions. An isomorphism between
two users $a$ and $b$ in a network exists if there is a mapping $\pi : E(a) \to E(b)$, where $E(a)$ is the set of relationships held
by user $a$, such that for every pair of users $x$ and $y$, we have $(x, y) \in E(a)$ if and only if $(\pi(x), \pi(y)) \in E(b)$. In
other words, users $a$ and $b$ must have isomorphic ego-networks, where an ego-network is a tuple $(V_e, E_e)$ in which $V_e$ is the set of all
users in the 2nd degree neighborhood of a user and $E_e$ represents the directed relationships that bind the users in
$V_e$ together. This suggests that one could simply switch the location of users $a$ and $b$ and their connectivity to others
without disturbing the overall structure of the network. Practically, two “managers” in a network that report to an
“executive” and lead the same number of “team members” would be isomorphically equivalent if the connectivity
among the “team members” of the two “managers” were isomorphic. This equivalence definition thus captures a
more intuitive notion for ascribing a user’s role in a social system. A still broader class is regular equivalence,
which requires the roles of the alters of two users to be identical. Specifically, if $\mathcal{R}(x)$ is a function that assigns
a user $x$ to a role, we say users $a$ and $b$ are regularly equivalent if $\mathcal{R}(a) = \mathcal{R}(b)$ and if every user $n$ in the
ego-network $N(a)$ of $a$ can be mapped to a user $m$ in the ego-network of $b$ such that $\mathcal{R}(n) = \mathcal{R}(m)$. For example,
“managers” would be regularly equivalent so long as they both connect to “executives” and “team members”.
Isomorphic and regular equivalences may be identified by performing a blockmodeling over the adjacency matrix
of a social system [89].

Notions of structural, isomorphic, and regular equivalence are decades old theories that have been instrumental
in many social network analyses [82] [?] [30]. More recent work has used these notions to study international
relationships across institutions [69], firms [?], and governments [?], and to study peer influences [36].
Isomorphic equivalence has been applied to hospitals within referral networks [50] to discover closed communities
of health services and hospitals that carry identical areas of expertise. It is also employed in the study of
citation networks [85] to identify researchers within an organization that perform similar research and offer similar
domain expertise. Regular equivalences have been studied in networks of relations among gang members in urban
settings [?] and of relations among cities across the world [2].

2.2 Implied role discovery 

In implied role analysis, a researcher defines the set of social roles users of a social system are expected to 
exhibit before any data or structural analysis commences. It is a qualitative, iterative process that generally follows 
the workflow of Figure 1. Based on at-hand information about a social system, roles are first defined based on the
subset of functionality allowed by the system that the user may perform. For example, consider an online forum 
where users may decide to browse conversations but never post, or can become an administrator that edits and 
controls the behavior of others in the system. An analyst may therefore first define the social roles lurker (one who 
never posts), moderator (one who controls behavior), and poster (one who contributes to conversations). With 
these roles assumed to exist, the analyst studies the actions of users and their relations with others. The initial 
definitions of the social roles are then iteratively refined as evidence from the social system is collected. 

Figure 1: Workflow for implied-role analysis

Implied role analysis is useful when a social system is well understood and highly structured, and when the analyst
wishes to understand the interactions among users on the basis of the kinds of operations they perform. For
example, Nolker et al. tapped into their experiences with online bulletin board systems to predefine members of
a Usenet group into the roles leader, motivator, and chatter [?]. They identified the behavioral attributes that are
indicative of each role, and labeled users exhibiting such behaviors in a log of the group’s activity. Golder et al. also
studied Usenet groups but proposed a different taxonomy of roles that includes celebrities, ranters, lurkers, trolls,
and newbies [?]. They sifted through conversations across different Usenet groups to study behaviors associated
with each role. Gliwa et al. examined collections of online bloggers and defined the roles selfish influential
user, social influential user, selfish influential blogger, social influential blogger, influential commentator, standard
commentator, not active, and standard blogger [40]. Welser et al. defined four roles for Wikipedia users, namely
substantive experts, technical editors, counter vandalism, and social networkers [92]. They subsequently searched
for patterns about how users contribute and interact with others in order to classify the users falling in each role.

2.3 Data-driven role discovery 

A third type of approach is to infer social roles by the features of a dataset without pre-defining the roles that
exist. These data-driven approaches, whose workflow is summarized in Figure 2, generally consider features
about users and the structure of their ego-networks in an unsupervised machine learning algorithm. Social roles
are defined as the groups the algorithm places users into based on the similarity of these features. Studies that
apply unsupervised learners for social role discovery vary in sophistication. For example, Hautz et al. categorized
users in an online community of jewelry designers by mapping their out- and in-degree distributions and
frequency of interactions to “low” or “high” levels [?]. Zhu et al. use k-means clustering to identify user roles in a
network of phone calls based on similar calling behaviors, ego-network clustering coefficients, and mean geodesic
distances between users [102]. Chan et al. discover roles by agglomerative hierarchical clustering with over fifty
behavioral and structural features of users across the post/reply network of many online forums [?]. White et
al. use a mixed membership probabilistic model to identify roles across online forums using behavioral features
and found a number of possible assignments of users into groups [93]. Rowe et al. use behavioral ontologies
and semantic rules to automatically group online forum users into roles based on the content of their posts [?].
Although data-driven approaches define similarity based on the structural features of ego-networks, this class of
methods is not an approximation of equivalence based role discovery. This is because data-driven methods may
search for the similarity of two users based on many feature types that are not structural, including their personal
attributes, their behaviors on the social system, and the content of their interactions with others.

Figure 2: Workflow for data-driven role analysis

2.4 Comparative analysis 

The recent availability of data about very large scale social systems, typically collected from online social
networks (Facebook; Google+), social media (Twitter; Tumblr), and innovative information exchanges (Wikipedia;
StackExchange) enables the study of the social roles of users in systems that have a world-wide reach. The massive
scale of these systems necessitates an evaluation of current approaches for discovering social roles, so that the
most effective type given their size can be identified.

Equivalence based role discovery comprises a number of well-studied, longstanding methods that have deep
roots in sociological theory. Unfortunately, it may be infeasible to precisely identify users falling into isomorphic
or regular equivalence classes within large scale social systems. This is because the problem of finding isomorphic
ego-networks is closely aligned to searching for all motifs of arbitrary size within the network, and the problem
of identifying regularly equivalent positions is related to searching for a k-coloring of G, with k unknown a priori
(both are NP-hard problems [52]). Researchers still interested in identifying these equivalences in large systems
must resort to numerical approximations based on quantitative notions of structural similarity between two users
that may be difficult to apply and analyze in practice [70] [32] [?]. Thus, despite the rich theory these methods are grounded
within, technical challenges bar their adequate adaptation for large scale social systems.

Implied role analyses carry fewer technical challenges. This is because the most difficult aspect, identifying
the roles that exist, is settled by an analyst before trends in the data are considered. However, implied role
analyses run the risk of using noisy signals in the data that appear by chance as evidence for the roles they have
predefined. Furthermore, it is possible for separate analysts to define completely different sets of social roles for
the same system, which may confuse or conflict with each other. For example, Nolker et al. place Usenet members
into leader, motivator, and chatter roles [?]. Are these roles compatible with the alternative set of celebrity,
ranter, lurker, troll, and newbie roles proposed by Golder et al. for the same system [?]? It is unclear if one
set of roles is more suitable than the other, or if the cross-product of the two types of roles (e.g. leader-celebrity
or chatter-lurker) is also a valid set of roles. Furthermore, the implied roles tend to speak to the functionality
or actions that users of the social system undertake instead of reflecting the reasons why they participate in the
system and the way they are structurally embedded within it. Thus, although there are fewer technical challenges
to run implied role analysis over large scale social systems, the resulting roles may have a weak relationship to
sociological theory.

Data-driven social role analysis may be a promising type of approach for the discovery of social roles in large- 
scale social systems. This is because modern day “big data” technologies enable the collection of incredible 
amounts of information about each user, their connections with others in the social system, and the details or 
the content of their interactions. Instead of assuming that specific kinds of social roles in the system must exist, 
data-driven analyses apply data mining algorithms or learn data models from which the social roles of the system 
emerge. Such approaches let the data inform the analyst what social roles exist, rather than require a definition 
of the roles before studying the data. Fortunately, recent big data systems and methods research enable the rapid 
mining and building of data models from large social systems. For example, Zhang et al. tackle computations over
real-world and virtual social interaction data by performing Tucker decompositions of a tensor representation of
the interactions [?]. A distributed learning algorithm based on MapReduce, proposed by Tang et al., efficiently
identifies the influencers and experts latent within large social systems [84]. Cambria et al. use a comparative
analysis of the performance of multiple natural language processing algorithms to find patterns in the content of social
interactions [?]. Giannakis et al. present a series of articles that describe how sensor signal processing algorithms
may be adapted to operate over big and social data sets [38]. Malcom et al. even developed a uniform programming
interface so that non-experts can utilize state-of-the-art big data technologies [?] for social role analysis.
However, the relationship of such analyses with longstanding social theory varies considerably. This is because
while some data mining algorithms and models encode aspects of social theory in their technical development,
others give no consideration to these theories or make assumptions that are incompatible
with past social science research for the sake of model tractability. Furthermore, algorithms and models for social
role analysis may use features that do not reflect aspects of the social forces that drive users to embed themselves in
the network in a specific way [?].

3 Triad-based Social Role Extraction 

In this section, a new data-driven approach for extracting social roles from large social systems is introduced.¹
Based on the discussion in the previous section, it only considers features that have a grounding in social theory,
namely the conditional triads that compose each user’s ego-network. After network sampling and dimensionality
reduction, k-means clustering is applied to the vectors to identify social roles. Ego-networks falling closest to
the centroid of each cluster are interpreted for role analysis. This section describes what conditional triads are, the
triad-based representation of an ego-network, the social systems used to illustrate the methodology, and the role 
extraction process. 

3.1 Conditional Triad Census 

In social network analysis, a triad is a group of three individuals and the pairwise interactions among them [79].
They are the smallest sociological unit from which the dynamics of a multi-person relationship can be observed,
and hence, are considered to be the atomic unit of a social network [?] [90] [?]. For example, third actors may
act as a moderating force that can resolve conflicts among two others [?]. They may also sabotage an existing
relationship or induce a feeling of unwelcomeness to a specific alter [?]. Such observations have been used to
develop theories that associate the configuration of a triad to specific underlying effects that promote specific
kinds of social interactions [?].

Figure 3 captures the 36 different ways an individual (white) can be oriented towards two alters (blue) within a
triad [?]. These orientations are the set of all conditional triads, which are defined by the structure of the three
way relation based on the position of an individual within it. For example, triads 6 and 11 are structurally identical
(having two null and one mutual tie). In triad 11, the white user is isolated whereas in triad 6 she is connected
to an alter. The entire structure of an ego-network can thus be represented by the number and different types
of conditional triads it is composed of. The conditional triad census [?] [28] of an ego-network is defined as a
36-element vector whose i-th component represents the proportion of type i conditional triads it is composed of.

Figure 3: Types of conditional triads

¹Parts of this method were presented at the First Workshop on Interaction and Exchange in Social Media at the 2014 International
Conference on Social Informatics [?].

Searching for ego-networks whose conditional triad censuses are similar is expected to lead to a meaningful
grouping of users into social roles. This is because each triad configuration represents a sociological factor about
how a user interacts with others [?]. For example, triad 32 has a user on the receiving end of a chain of interactions.
If these interactions represent the passage of information or rumors, it implies that the alter in the middle
of the chain is capable of manipulating what becomes shared with the user and may not be trustworthy. In triad
5, the user receives interactions from two alters but chooses not to reciprocate. Ego-networks largely composed
of this triad suggest that the user receives many interactions but, for possibly selfish reasons, seldom chooses to
reciprocate. By summarizing how frequently each of these triads appears, a conditional triad census succinctly
models the strength of the different kinds of social factors that surround the nature of one’s interactions with others.
These factors, taken together by considering the entire census as a vector, therefore represent the circumstances
and reasons why a user participates in a social system.
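
To make the census concrete, the following Python sketch counts conditional triads for one user. It is an illustration under assumptions rather than the article's implementation: the function names are hypothetical, the set of alters is supplied by the caller, and triad types are keyed by a canonical tuple of dyad states instead of the type numbers used in Figure 3. Each of the three dyads can be null, outgoing, incoming, or mutual, and treating the two alters as interchangeable leaves exactly 36 distinct types, which agrees with the 36 orientations described above.

```python
from collections import Counter
from itertools import combinations


def dyad(edges, u, v):
    """State of the tie between u and v as seen from u:
    'null', 'out' (u->v only), 'in' (v->u only), or 'mutual'."""
    out, inn = (u, v) in edges, (v, u) in edges
    if out and inn:
        return "mutual"
    return "out" if out else "in" if inn else "null"


def conditional_triad(edges, ego, a, b):
    """Canonical key for the triad {ego, a, b} from the ego's position."""
    k1 = (dyad(edges, ego, a), dyad(edges, ego, b), dyad(edges, a, b))
    k2 = (dyad(edges, ego, b), dyad(edges, ego, a), dyad(edges, b, a))
    # Swapping the labels of the two alters must not change the triad type.
    return min(k1, k2)


def conditional_triad_census(edges, ego, alters):
    """Proportion of each conditional triad type over all pairs of alters."""
    counts = Counter(conditional_triad(edges, ego, a, b)
                     for a, b in combinations(alters, 2))
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


# Hypothetical ego who receives interactions from two alters and never reciprocates.
edges = {("x", "ego"), ("y", "ego")}
census = conditional_triad_census(edges, "ego", ["x", "y"])
```

Mapping each canonical key to its index in Figure 3 would turn the dictionary into the 36-element vector used for clustering; that mapping is not reproduced here.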

The number of and kinds of roles that exist in a social system can thus be identified by: (i) computing the
conditional triad census of every user; and (ii) clustering users into groups based on the similarity (vector distance)
of the conditional triad censuses. This approach is somewhat related to discovering social groups in networks by
searching for ego-networks that participate in similarly shaped k-cliques [43] or k-cores (sub-graphs where all
nodes are connected to at least k others [29]) [?] [76] [56]. However, searching for ego-networks that satisfy these
strict requirements will only identify sets of nodes surrounded by a similarly dense network and leave hidden
other nodes whose ego-networks are less connected but still have similar connectivity patterns. Such analysis
also pays no consideration to the social forces or actions that drive users in cliques or cores to interact with each
other, since the types of triads within the groups are ignored. Furthermore, it is difficult to know a priori what
kinds of k-cliques and k-cores correspond to relevant social roles in a large-scale social system. In comparison,
the proposed approach learns significant structural patterns of ego-networks based on a feature reflecting the types
of social forces that bind a user and her connections together. It leads to a classification where users in the same
group participate and interact with their contacts under similar social circumstances and forces, which speaks very
closely to the notion of a social role.

Table 1: Dataset summary statistics

138,592 | 1.54 (p > 0.999) | 1.83 (p > 0.999)

3.2 Dataset description 

The methodology is demonstrated by discovering social roles in two popular online social systems, namely 
Facebook and Wikipedia. These systems were chosen because they each serve a different purpose and provide 
distinct mechanisms for users to interact with each other. Facebook is used as a platform to informally share 
personal information, photos, and events with friends and family. Its interaction network is built by placing a 
directed edge from user a to b if a posts at least one message on the wall (a collection of public messages) of b.
Wikipedia is an online encyclopedia with articles that are written and edited by an open community. Interactions
on Wikipedia are defined by the modification of content contributed by another user; a directed edge from a to
b is added if a edited the text, reverted a change, or voted on approving an action to an article made by b. Both
the Facebook and Wikipedia networks were constructed from publicly available datasets [?] [64]. These datasets
only record the act of an interaction; they do not include any information about the content or the type of the
interaction. Although the Facebook data set is dated (interactions were recorded in 2009), privacy improvements
made to the Facebook API since then make it all but impossible to capture such interactions at scale today.

Table 1 presents summary statistics for these interaction networks, illustrating how they vary in size, shape, and
user behaviors. The relatively small size (46,952 users, 264,004 pairwise interactions) of the Facebook network is
due to the fact that it only represents users within a single regional network (the Facebook social graph was divided
by user regions in its earliest form). The data also only represents users whose accounts were shared publicly,
which was the default Facebook setting during the data collection period [?]. Despite its size and limit to a single
regional network, previous work showed that all regional Facebook networks exhibited a similar structure (average
path lengths and diameter) and shape (clustering coefficients and assortativity) [96]. More recent studies further
confirm that the structure and shape of these regional networks are very similar to those of the modern
global Facebook network [97] [86]; therefore this data set is expected to contain interaction patterns similar to those seen
in the global Facebook network. The Wikipedia network is almost three times the size of the Facebook network, with 138,592
users and 740,397 distinct pairwise interactions, but its clustering coefficient C is approximately 55% smaller.
These measurements suggest that Facebook users have a greater tendency to surround themselves within denser
ego-networks compared to Wikipedia users. The lower clustering coefficient of Wikipedia could be explained by
users who generally limit themselves to modifying articles written by a specific group (perhaps representing a
specific topic).

Figure 4: In- (top) and out-degree (bottom) distributions

The in- and out-degree distributions of each network are presented in Figure 4, and they exhibit power-tailed shapes.
The existence of power-law behavior is tested by a maximum likelihood approach [?] and the resulting power-law
exponents $\alpha_{in}$ and $\alpha_{out}$ are given in Table 1. The estimates of the power-law exponent are very reliable (p > 0.95;
note that the test considers the hypothesis $H_0$: the empirical data follows a power-tailed distribution) except for
the in-degree distribution of Facebook, which may be because its range only covers two orders of magnitude.
A larger power-law exponent indicates that the distribution drops to zero faster in its right-tail [?], hence the
frequency with which users interact with others on Wikipedia exhibits a smaller amount of variation compared
to Facebook. In other words, it is less likely to find a user who interacts with an unexpectedly high number of
others on Wikipedia compared to Facebook, and less likely to find a user receiving many interactions from others
on Facebook compared to Wikipedia.
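
As a rough illustration of how a power-law exponent can be estimated by maximum likelihood, the sketch below uses the standard closed-form estimator for a continuous power-law tail, $\hat{\alpha} = 1 + n / \sum_i \ln(x_i / x_{min})$. This is an assumption about the general technique, not a reproduction of the cited procedure: the article does not state which estimator variant or cutoff $x_{min}$ was used, degree data are discrete, and the goodness-of-fit p-value is not computed here. The helper names and the synthetic sample are hypothetical.

```python
import math
import random


def powerlaw_alpha_mle(values, x_min):
    """Continuous maximum likelihood estimate of a power-law tail exponent."""
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("need observations strictly above x_min")
    return 1.0 + len(tail) / log_sum


def sample_powerlaw(alpha, x_min, n, seed=0):
    """Draw n synthetic values from a continuous power law by inverse transform."""
    rng = random.Random(seed)
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]


# Synthetic check: recover a known exponent from simulated data.
synthetic = sample_powerlaw(alpha=1.83, x_min=1.0, n=20000)
alpha_hat = powerlaw_alpha_mle(synthetic, x_min=1.0)
```

With a sample this large the estimate lands close to the exponent used to generate the data; on real degree sequences the choice of `x_min` strongly affects the result.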

3.3 Network sampling 

Computing the conditional triad census of every ego-network requires an examination of $O(|V|^3)$ triples of
users in an interaction network. This computational cost may be an insurmountable burden when computing conditional
triad censuses in larger interaction networks where the number of nodes is in the millions [86]. Furthermore,
existing algorithms that can compute censuses in $O(|V|^2)$ [?] or $O(|E|)$ [?] only consider users’ unconditional
triad censuses. An unconditional triad census is a 16-element vector holding the proportion of all triads without
regard to the position of the user in her ego-network, making them incompatible with the proposed approach.
However, since the components of a conditional triad census are the proportions of triad types in an ego-network,
the conditional censuses within a carefully selected sample of the original network should be representative of the
conditional censuses in the original network. A sample of a network $G$ is a new network $G_s = (V_s, E_s)$ where
$V_s \subset V$, $E_s \subset E$, and $|V_s| = \phi|V|$ with $0 < \phi < 1$.

A sampling method must ensure that the two critical local structural properties of ego-networks, namely the
degree distribution and local clustering coefficient distribution, are preserved [?]. For example, ego-networks
with high degree will naturally tend to have triads with relations among multiple alters, and lower (higher) clustering
coefficients indicate a greater proportion of open (closed) triads. However, naive methods for network sampling
do a poor job of preserving these local features. A number of advanced sampling methods have been proposed,
but each one can only preserve different types of structural features of the full network [?]. Therefore, four widely
used graph sampling techniques for choosing $V_s$ and $E_s$ were compared by their ability to preserve the degree
distribution of the users’ ego-networks and their clustering coefficients. The techniques and their processes are:

1. Vertex Sampling (VS): Let $V_s$ be a random sample of $\phi|V|$ vertices from $V$ and define $E_s$ to be the 
set of all edges among the vertices in $V_s$ from $G$. 

2. Edge Sampling (ES): Randomly choose an edge $e = (v_1, v_2)$ from $E$, add it to $E_s$, and add $v_1$ and 
$v_2$ to $V_s$ if they have not yet been added. Continue to choose edges from $E$ until $|V_s| = \phi|V|$. 

3. Forest Fire Sampling (FFS) [59]: Choose a random vertex $v$ from $V$, randomly select $p/(1 - p)$ of its 
outgoing edges, and add these edges to $E_s$. Place every vertex incident to the added edges into a set $V_b$ 
of ‘burned vertices’ and update $V_s$ by $V_s = V_s \cup V_b$. Randomly choose a burned vertex from $V_b$, and 
recursively repeat this process until $|V_s| = \phi|V|$. The parameter assignment $p = 0.7$ is used based on 
the recommendation of the method’s authors [59]. 

4. ES-i (ESI) [1]: Randomly choose an edge $e = (v_1, v_2)$ from $E$ and add $v_1$ and $v_2$ to $V_s$ if they 
have not yet been added (note that $e$ is not added to $E_s$). Continue sampling until $|V_s| = \phi|V|$. 
Finally, define $E_s$ to be the set of all edges among the vertices in $V_s$ from $G$. 

Figure 5: Comparison of graph sampling methods 
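The vertex sampling and ES-i procedures in the list above can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: the graph is assumed to be given as a set of vertices and a set of directed edge tuples, the function names are hypothetical, and the quota guard and the rule that stops exactly at the quota are assumptions added so that the sketch always terminates.

```python
import random


def induced_edges(edges, v_s):
    """Return every edge of the original graph whose endpoints both lie in v_s."""
    return {(u, v) for (u, v) in edges if u in v_s and v in v_s}


def vertex_sample(vertices, edges, phi, rng):
    """Vertex Sampling (VS): draw round(phi * |V|) vertices, keep the induced edges."""
    target = round(phi * len(vertices))
    v_s = set(rng.sample(sorted(vertices), target))
    return v_s, induced_edges(edges, v_s)


def es_i_sample(vertices, edges, phi, rng):
    """ES-i: collect endpoints of randomly chosen edges, then keep the induced edges."""
    edge_list = sorted(edges)
    touched = {x for e in edge_list for x in e}
    # Guard (an assumption): never ask for more vertices than random edges can reach.
    target = min(round(phi * len(vertices)), len(touched))
    v_s = set()
    while len(v_s) < target:
        u, v = rng.choice(edge_list)
        v_s.add(u)
        if len(v_s) < target:  # stop exactly at the vertex quota
            v_s.add(v)
    return v_s, induced_edges(edges, v_s)
```

Both functions return the pair $(V_s, E_s)$; passing an explicit `random.Random` instance keeps repeated samples reproducible.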


Figure 6: Triad Role Census Sample Values with 95% Confidence Intervals 

The Kolmogorov-Smirnov distance metric $D$ was used to compare how closely the degree and clustering 
coefficient distributions of samples $G_s$ taken with each method follow the distributions of the original 
network $G$. It is defined as the largest difference between the original distribution $F$ and the distribution 
of the sample $F_s$: 

$$D = \sup_x |F_s(x) - F(x)|$$ 
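The distance $D$ can be computed directly from two lists of observed values (for example, vertex degrees in the sample and in the original network). The sketch below is an illustration with hypothetical function names; it relies on the fact that the difference of two empirical distribution functions only changes at observed values, so checking those points is sufficient.

```python
from bisect import bisect_right


def ecdf(values):
    """Return the empirical distribution function of a list of observations."""
    xs = sorted(values)
    n = len(xs)
    return lambda x: bisect_right(xs, x) / n


def ks_distance(sample, original):
    """Largest absolute gap between the two empirical distribution functions."""
    f_s, f = ecdf(sample), ecdf(original)
    # Both step functions are constant between observations, so the supremum
    # is attained at one of the observed values.
    points = set(sample) | set(original)
    return max(abs(f_s(x) - f(x)) for x in points)
```

Identical samples give a distance of 0, and samples with no overlap in their range give a distance of 1.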

Figure 5 compares the average $D$ over 100 samples taken by each method for different values of $\phi$. For the 
Facebook network, FFS does the best job ($D < 0.2$) of preserving both the degree and clustering coefficient 
distributions for modest sample sizes ($\phi > 0.33$). FFS samples of the Wikipedia network best preserve the 
clustering coefficient distribution for any sample size, and for $0.2 < \phi < 0.3$ FFS and VS are similarly 
faithful to the original network’s degree distribution. Ultimately, FFS sampling is found to preserve the local 
structure of both networks even for small sample sizes. 

A value of $\phi$ that provides a reasonable trade-off between computational speed and sample consistency was 
then sought. Figure 6 plots the average value of each component of the conditional triad censuses taken from 
$n = 20$ independently generated FFS samples of each network for $\phi = 0.35$ (triad 1 is excluded because of 
its disproportionately high frequency), along with the 95% confidence interval of the proportions. The 
proportions of triad types across the samples are similar and feature small confidence intervals. Since the 
cost of computing triad censuses at this sampling level is very reasonable (less than 30 minutes in a parallel 
computation over three cores of an Intel i5 processor), the setting $\phi = 0.35$ is used for role analysis. 
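The text does not state how the 95% confidence intervals were computed. One common choice, shown here purely as an assumption, is a Student's t interval around the mean of one census component across the $n = 20$ samples. The function name and the input values in the usage note are hypothetical; the constant is the standard two-sided 95% critical value for 19 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical value of Student's t with 19 degrees of freedom (n = 20).
T_975_DF19 = 2.093


def mean_with_ci(values, t_crit):
    """Return the sample mean and a t-based confidence interval around it."""
    m = mean(values)
    half_width = t_crit * stdev(values) / sqrt(len(values))
    return m, (m - half_width, m + half_width)
```

For instance, `mean_with_ci(proportions, T_975_DF19)` would be applied to the 20 values of a single triad type's proportion, once per component of the census.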

3.4 Census clustering 

Figure 7: Scree plots for (a) Facebook and (b) Wikipedia 

$k$-means clustering, a common and flexible algorithm for discovering latent groups in data [100][37], is used 
to separate users into roles. $k$-means clustering defines $k$ centroid positions in the vector space and 
assigns each conditional triad census (and hence user) to a cluster based on the centroid it is most similar 
to. Since the components of the censuses take a value between 0 and 1, this similarity is defined as the 
$\ell_2$-norm of their difference vector. After the assignment of conditional triad censuses to clusters, the 
position of the centroid of each cluster is updated. Censuses are then reassigned to their closest centroid, 
and the process repeats until there are no changes to any cluster assignments. 
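The assignment and update loop described above can be sketched as follows. This is a minimal illustration, not the implementation used in the study: the function name, the explicit list of initial centroids, the iteration cap, and the rule that an empty cluster keeps its previous centroid are all assumptions made for the sketch.

```python
from math import dist


def kmeans(points, centroids, max_iter=100):
    """Lloyd-style k-means; `centroids` holds the initial centroid positions."""
    centroids = [tuple(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid under the l2-norm of the difference.
        new_labels = [
            min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # no assignment changed, so the loop has converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids
```

A random initialization can be obtained by drawing the starting centroids with `random.sample(points, k)`; because the result depends on that draw, the procedure is usually repeated over several initializations.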

3.4.1 Dimensionality reduction 

Figure 6 indicates that many components of the conditional triad censuses are close to or equal to 0. A 
dimensionality reduction technique, namely principal component analysis (PCA) [47], is therefore applied to the 
conditional triad censuses. PCA identifies a projection of the data into a lower dimensional subspace that 
preserves as much variation within the original space as possible. Figure 7 plots the proportion of variation 
within the original dataset that is retained when PCA is used to reduce the data to smaller numbers of 
principal components. The smallest dimensional space that still preserved a large proportion of the variation 
in the data (> 85%) was chosen, as indicated by the red line in Figure 7. The figure suggests that PCA finds a 
significantly lower dimensional space for clustering the conditional triad censuses of each network, from 36 
dimensions to just 6 and 3 for the Facebook and Wikipedia interaction networks, respectively. 
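The selection rule read off the scree plot can be written down as a short sketch. It assumes that the variance captured by each principal component (the eigenvalues of the covariance matrix of the censuses) has already been computed by some PCA routine; the function name and the numbers used to exercise it are hypothetical.

```python
def components_to_keep(eigenvalues, threshold=0.85):
    """Smallest number of leading components whose variance share exceeds threshold."""
    total = sum(eigenvalues)
    cumulative = 0.0
    # Components are taken in order of decreasing variance, as on a scree plot.
    for m, ev in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += ev
        if cumulative / total > threshold:
            return m
    return len(eigenvalues)
```

With component variances of 5, 3, 1, and 1, for example, the first two components retain 80% of the variation and the first three retain 90%, so three components would be kept.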

3.4.2 Clustering evaluation 

Figure 8: Silhouette coefficients for (a) Facebook and (b) Wikipedia 

$k$-means clustering requires the number of clusters $k$ to be chosen beforehand, forcing an analyst to assert 
the specific number of social roles that may exist in the system. Instead, the silhouette coefficient metric 
[83] is used to quantitatively evaluate the quality of clusters for different values of $k$, so that the $k$ 
yielding the ‘best’ clustering is chosen. It is defined as follows: consider a division of censuses into $k$ 
clusters $\mathcal{C}_k = \{C_1, C_2, \ldots, C_k\}$. Let $\alpha(\mathbf{x}) = d(\mathbf{x}, C_i^*)$, 
$\mathbf{x} \in C_i$, be the distance from the vector $\mathbf{x}$ to the centroid $C_i^*$ of its assigned 
cluster $C_i$ (measuring intra-cluster distance) and 
$\beta(\mathbf{x}) = \min_{C_j \in \mathcal{C}_k,\, j \neq i} d(\mathbf{x}, C_j^*)$ be the distance from 
$\mathbf{x}$ to the centroid of the nearest cluster $C_j$ that $\mathbf{x}$ is not assigned to (measuring 
inter-cluster distance). The silhouette of $\mathbf{x}$ is defined as: 

$$s(\mathbf{x}) = \frac{\beta(\mathbf{x}) - \alpha(\mathbf{x})}{\max(\beta(\mathbf{x}), \alpha(\mathbf{x}))}$$ 

Note that $s(\mathbf{x})$ approaches 1 as the separation between the cluster $\mathbf{x}$ is assigned to and 
the nearest other cluster increases. The average silhouette of every clustered vector defines the silhouette 
coefficient of a clustering $\mathcal{C}_k$: 

$$SC_{\mathcal{C}_k} = \frac{\sum_{\mathbf{x} \in X} s(\mathbf{x})}{|X|}$$ 

where $X$ is the set of all data vectors. Previous studies indicate that values of $SC_{\mathcal{C}_k}$ greater 
than 0.7 mean the algorithm achieved superior separation, and values between 0.5 and 0.7 indicate a reasonable 
separation [?]. 
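The centroid-based silhouette coefficient defined above can be sketched as follows. The function and variable names are hypothetical, and the handling of a zero denominator is an assumption. Note that this follows the definition given in this section, which measures distances to cluster centroids, whereas the classical silhouette uses average pairwise distances to cluster members.

```python
from math import dist


def silhouette_coefficient(points, labels, centroids):
    """Average centroid-based silhouette over all clustered vectors."""
    total = 0.0
    for p, lab in zip(points, labels):
        # Intra-cluster distance: to the centroid of the assigned cluster.
        alpha = dist(p, centroids[lab])
        # Inter-cluster distance: to the nearest centroid of any other cluster.
        beta = min(dist(p, c) for j, c in enumerate(centroids) if j != lab)
        denom = max(alpha, beta)
        # Assumption: a vector coinciding with both centroids contributes 0.
        total += (beta - alpha) / denom if denom > 0 else 0.0
    return total / len(points)
```

The function requires at least two clusters, which matches the range of $k$ evaluated in this section.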
For a given value of $k$, we ran 50 $k$-means clusterings over the PCA-reduced conditional triad censuses using 
different random initializations of the centroid positions. Figure 8 plots the average silhouette coefficient 
of these trials for $2 \le k \le 9$. It reveals excellent clustering solutions at $k = 3$ and $k = 2$ clusters 
for the Facebook and Wikipedia censuses, with silhouette coefficients of 0.73 and 0.90, respectively. A 
qualitative validation of the adequacy of a clustering solution is also given in Figure 9. Here, the 
conditional triad censuses in a space defined by the first three principal components are assigned a marking 
and color corresponding to their cluster assignment. The Facebook clustering solution, given in the top panels 
of the figure, discovers a role (the red cluster of circle points) that exhibits large variation along two 
principal components. In contrast, a second role (the green cluster of square points) varies strongly along the 
third component. The smallest cluster (blue cluster of triangle points) only varies along the first component. 
Since the clusters exhibit little variation along different directions, different subsets of conditional triads 
must appear in similar proportions within the censuses of the same group. The Wikipedia clustering solution, 
given in the bottom panels of Figure 9, also finds that the two clusters vary along the direction of different 
principal components: the red cluster of circle points varies along the second and third components, while the 
blue cluster of triangle points mainly varies along the first component. 

Figure 9: Clusters along the first three principal components for Facebook (top) and Wikipedia (bottom). 
Panels: (a) principal components 1 and 2, (b) principal components 1 and 3, (c) principal components 2 and 3. 

4 Triad-based Role Analysis 

In this section, the kinds of social roles that emerge from our clustering analysis are analyzed. For this 
purpose, the average centroid position $C_i^*$ of each cluster $i$ over the clustering results was identified, 
and the user $u_i^*$ whose conditional triad census is located closest to $C_i^*$ was found. $u_i^*$ is defined 
as the “central user” of role $i$, whose ego-network is the “central structure” of the role. Due to its 
position in the cluster, this “central structure” represents the way a prototypical user having this role 
embeds herself within the social system. In other words, the ego-network structures of users in role $i$ are 
more similar to this central structure than to any other central structure in the network. Each central 
structure is given a social role label based on a subjective interpretation of the user’s position within it. 
The label captures the way users of a role interact with others in the system, and how the structure 
representing a role affects the kinds of interactions that are possible. The role labels may not be applicable 
to all social systems, although it is feasible that systems created under a similar context (e.g. social 
sharing sites) exhibit similar central role structures and labels. The central role structures discovered in 
the Facebook and Wikipedia networks, and support for the emergence of these roles in the literature, are 
presented next. 

Role label                     Structure       Proportion of users 
Social Group Manager           Figure 10(a)    56.6% 
Exclusive Group Participant    Figure 10(b)    28.4% 
Information Absorber           Figure 10(c)    15.0% 

Table 2: Facebook roles 
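Locating the central user of a role amounts to a nearest-neighbor search around the role's centroid. The sketch below illustrates this under assumed data structures: `censuses` maps a user identifier to her (possibly PCA-reduced) census vector, `labels` maps a user identifier to her role index, and all names and values are hypothetical.

```python
from math import dist


def central_user(censuses, labels, centroid, role):
    """Return the user in `role` whose census vector lies closest to `centroid`."""
    members = [u for u in censuses if labels[u] == role]
    return min(members, key=lambda u: dist(censuses[u], centroid))
```

The ego-network of the returned user would then serve as the central structure of that role.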

4.1 Facebook 

Figure 10 shows the central role structures of the three social roles found on Facebook. A label representing 
each role structure and the proportion of users falling under each are presented in Table 2. In these figures, 
the red node (with a red arrow pointing to it) corresponds to the central user and the blue nodes are the 
members of her ego-network. The structure in Figure 10(a) represents a social role that the majority of 
Facebook users (56.6%) fall into, where a user is centrally embedded between many disconnected groups of 
others. She lies in a position critical for maintaining connectivity between communities, and hence, lies in 
the brokerage position of many open triads. These many open triads give users in this role many opportunities 
to control if and how information passes from one group to another. However, given that Facebook is used as a 
platform for social sharing, such users may never decide to share information between communities when they 
represent different social circles. For example, one can envision the user in Figure 10(a) to be sitting 
between groups that correspond to colleagues at work, relatives, and personal friends. A user may never want 
personal information shared among relatives to be revealed to work colleagues, and may want conversations, 
rumors, and other information shared among friends to never be exposed to family members and work colleagues. 
That a majority of Facebook users fall into a social role that brokers among many disconnected circles is not 
surprising; many past studies have shown that most Facebook users face identity management and multiple 
presentation issues while interacting on the site [?][57]. Identification of these “social group managers” is 
thus a way of finding the bridges or weak ties [14] in the network based on structural patterns rooted in 
social theory. 

28.4% of Facebook users fall into the role represented by the central structure of Figure 10(b). This structure 
represents a user that has surrounded herself with a web of interactions running between her first-degree 
connections. This smaller percentage of users only participates in a single, tight-knit community of others 
rather than managing many disconnected groups. Such a role may represent users who only choose to ‘friend’ and 
interact with a collection of others that share many mutual connections, and who do not need to manage 
multiple disconnected social circles. Such patterns are known to be more prominent in the ego-networks of 
Facebook users who are more willing to share information with many others, or who do not feel a need to 
consider identity management on the social network [?]. Such “exclusive group participants” may therefore 
promote the use of Facebook as a genuine social sharing platform, and be instrumental in the development of 
dense interaction clusters in the structure of the network. 

Figure 10: Central role structures: Facebook, panels (a), (b), and (c) 

Figure 10(c) corresponds to the 15% of users who are positioned at the periphery of a single alter that 
interacts with many others. Since the structure corresponds to an average or typical ego-network structure for 
users in this role, it signifies a group of users who are passive and seldom share information with others. 
When they do share, it tends to be with those with whom the user has a mutual association. Furthermore, these 
users tend to receive information from alters that share prolifically. The phenomenon of over-active or 
extraordinarily well connected users on online social systems is well-studied [67][54][55], but it is 
interesting to discover that the users connected to them also play an important role in the online system. 
These users ‘absorb’ the information of the over-active others, since they only forward such information to 
those already connected to the over-active source. In fact, a modern use of the Facebook platform is to “absorb 
information” from friends and news organizations rather than to share social information, as reflected by this 
social role [?]. 
Role label                       Structure       Proportion of users 
Interdisciplinary Contributor    Figure 11(a)    89.7% 
Technical Editor                 Figure 11(b)    10.3% 

Table 3: Wikipedia roles 


4.2 Wikipedia 

The density of the central structures of the Wikipedia social roles shown in Figure 11 is a result of the many 
different ways interactions are defined, which include content editing, reverting a change, or voting on a 
pending action by another user. The triad-based analysis revealed two types of roles in Wikipedia. The first 
role is taken on by the majority of all users (89.7%) and has the central structure shown in Figure 11(a). The 
structure shows a user whose work is being changed by active alters that make changes to articles from many 
other authors as well. It is interesting that these active alters seldom edit content added by a common 
individual (i.e., they have few mutual connections), even though they are prolific editors. Such a pattern may 
emerge when these alters have different expertise and concentrate on editing contributions that fall within 
their specific domain. The existence of these ‘hubs’ of editing activity is not a surprising finding, as past 
work has confirmed that most editors on Wikipedia do exhibit domain-specific expertise and limit their edits to 
articles in their domain [?]. Users falling under this structure must therefore be contributing to 
interdisciplinary articles, which most Wikipedia articles are classified as [?]. Such “interdisciplinary 
contributors” represent the vast majority of users (89.7%) and are thus the primary role that adds information 
to Wikipedia. 

The remaining 10.3% of users fall under the role whose central structure is given in Figure 11(b). Two alters 
that take the form of a hub (a domain-specific expert) can be seen, but the overlap between them is larger and 
denser in comparison to Figure 11(a). The central user is positioned within this overlap. Users in this role 
therefore edit the contributions of many, and find their contributions edited by many others as well. A 
plausible explanation for finding a dense core between the positions of domain experts is that they perform 
‘general’ edits that address the language, grammar, spelling, hyperlinking, and structure of articles. Changes 
made by these “technical editors” may be further refined by a large number of other editors to improve the 
technical discussion or the presentation and language of an article. This explanation is compatible with past 
observations of users that concentrate on edits related to the language and format of an article [92]. 

Figure 11: Central role structures: Wikipedia 

In summary, the analysis demonstrates the use of conditional triad censuses to extract social roles from 
different types of online social systems. It naturally discovered social roles on Facebook corresponding to 
users who maintain connectivity across many disconnected social groups (“social group managers”), who 
participate in well-connected groups (“exclusive group participants”) that generate many social interactions, 
and passive users that serve as an outlet for over-active others to share information with (“information 
absorbers”) and may use Facebook as a platform to receive news. The roles discovered on Wikipedia focus on the 
nature of the user’s contribution to the content of the online encyclopedia. A majority of users 
(“interdisciplinary contributors”) are devoted to articles that attract the attention of editors focusing on 
different subsets of articles, which may correspond to the actions of domain-specific experts. The attraction 
of many experts suggests that the articles the central user focuses on are interdisciplinary in nature. A 
minority of users (“technical editors”) edit many articles at once, and have their articles edited by many 
others as well. These users may thus be domain-specific experts or could be users that apply general language 
and formatting changes to many articles on the site. 

4.3 Applying social role analysis 

Triad-based social role analysis offers not only insights into the nature of user behaviors on social systems, 
but also a practical tool for exploring social theories. For example, consider a researcher wishing to study 
whether or not the reasons and ways users interact with each other on Facebook are due to its inclusive, public 
nature. This research question may be explored by comparing the social roles that emerge on Facebook with those 
of a different online social network that is not inclusive and public, but exclusive and private. Differences 
in the number, shape and proportion of users falling into social roles across the two systems may give evidence 
of a relationship between the public or private nature of a social system and why people participate in it. To 
illustrate this, a data set of interactions recorded from a private online social network for students at the 
University of California Irvine (UC Irvine) is considered [72].* The six-month long data set consists of 1,899 
ties between users, with a directed tie from user A to B established when A sends at least one message to B. 
Triad-based social role analysis on the UC Irvine network revealed the best clustering solution at $k = 3$ 
roles (silhouette coefficient 0.713). Figure 12 visualizes the central structure of the resulting role 
clusters, which exhibit very similar features to the central structures of the social roles on Facebook. For 
example, Figures 10(a) and 12(a) both have a user situated between two groups of others. Figures 10(b) and 
12(b) find the user in the center of a well connected community, and Figures 10(c) and 12(c) show the user 
sitting at the periphery of a highly active alter. An analyst may therefore consider the two networks to 
exhibit the same social roles, and hence, conclude that users utilize the networks for similar reasons and in 
similar ways. Given that Facebook and the UC Irvine social networks were created to facilitate social 
communication and connection, it is not surprising to find similar roles and central structures emerging. 

* It should be emphasized that a complete study of this research question requires a comprehensive analysis of 
user behaviors, and extensive comparisons between many different social network datasets. The illustration that 
follows is limited, and is only meant to demonstrate how social role analysis can be used as a useful research 
tool. 

Figure 12: Central role structures: UC Irvine online network, panels (a), (b), and (c) 

                               Proportion of users 
Role label                     UC Irvine    Facebook 
Social Group Manager           3.06%        56.6% 
Exclusive Group Participant    92.9%        28.4% 
Information Absorber           4.04%        15.0% 

Table 4: Comparing the proportion of social roles on Facebook and UC Irvine networks 

Comparison of the shape of the central role structures and the proportion of users falling into them, however, 
reveals significant differences between the private UC Irvine and public Facebook online social networks. For 
example, the central role structure of social group managers in the UC Irvine network finds the ego to be 
situated between a smaller number of groups compared to Facebook, and has an additional alter managing the same 
set of social groups. These differences may arise because an individual participates in fewer separate groups 
within a private social network that is smaller in scope and encompasses fewer types of people than within a 
public social network that can include family, social, and work contacts. Furthermore, Table 4 shows the 
proportion of users falling into the social roles of the two networks to be very different. The majority of 
users in the UC Irvine network are exclusive group participants, that is, they are found to be embedded within 
a tight social group and do not need to manage a membership in many separate ones. In fact, only 3.06% of UC 
Irvine users act as a social group manager, compared to the 56.6% of Facebook users that take on this role. 
This difference may be rooted in the fact that its users are all students of UC Irvine, and hence, may exhibit 
homophilic tendencies through common class, standing, housing, major, college, and club affiliations. The many 
ways by which users could exhibit homophily on the UC Irvine network may also explain why the social group 
manager central structure has an alter managing the same set of groups as the ego; both could be managing 
groups of colleagues from the same class and club. The public nature of Facebook, however, may be reducing the 
level of homophily among a user’s connections. An analyst may point to these findings as key differences 
between public and private online social networks, and as a rationale to explore new hypotheses involving a 
comparison of homophilic tendencies within them. 

5 Further Opportunities for Large Scale Social Role Analysis 

Based on the related work discussed in Section 2 and on a reflection of the proposed triad-based method, this 
section summarizes additional challenges and opportunities that exist in social role analysis for large scale 
social systems. Opportunities along two important directions are considered: (i) finding meaningful features 
for role extraction; and (ii) understanding the relationship between functional and social roles. 


5.1 Linking representation with social theory 

As discussed in Section 2.4, many data-driven analyses select a large collection of structural, user, and 
relationship attributes, and use them all to discover the social roles within a system. However, this may be a 
dangerous practice because the resulting roles are defined according to the ‘similarity’ of a complex mixture 
of many variables. Furthermore, many quantitative structural, user, and relationship features do not 
necessarily have a close correspondence to a sociological theory that is related to the concept of a user role. 
For example, structural features such as the clustering coefficient or betweenness centrality of a user within 
her ego-network can quantify how clustered its structure is, but do not identify the telling patterns of the 
interactions within it. Analyses that use a large number of features thus lead to a separation of users into 
roles that must be defined very broadly, or where ego-network structures within roles may be discordant and 
have few interpretable structural regularities. Some methods using a large collection of features also apply 
post-processing steps to the resulting groups [?][102], which may further distort any interpretation of the 
extracted roles. 

This article takes a step toward the exclusive use of features that carry a specific social interpretation. 
However, it may be the case that additional features associated with social theories may improve the fidelity 
of the method’s results, or that a different unsupervised learning algorithm should be used. For example, Field 
et al. note the importance of preserving not only interactions, but also affiliation information between users 
in a social system to define their position [34]. Such a concept may be operationalized in a richer dataset 
containing affiliation information, by incorporating similarity measures of the rows of a $q \times n$ binary 
incidence matrix whose entry in row $i$ and column $j$ is 1 if user $j$ is affiliated with group $i$. Another 
related concept is the importance of social influence and the way it impacts a user’s social role [?]. 
Fortunately, there have been many measures proposed for quantifying influence that may be integrated into the 
social role mining process [?][36][?][60][20][51]. It is these kinds of factors, instead of conveniently chosen 
structural and user features, that should be considered when grouping users into social roles. 

5.2 Linking functional and social roles 

In an offline setting, people can interact, converse, and exchange ideas with each other in virtually 
innumerable ways. However, most large scale social datasets come from online systems that only offer a limited 
number of well-defined ways for people to interact with one another. It may be intuitive to think that these 
modes of interaction, which reflect the functional ways users participate in the social system, are associated 
with or have an effect on the social roles they go on to exhibit. For example, the roles identified over 
Wikipedia in this article were more closely related to the types of interactions allowed by the service (as an 
expert editing content or a generalist editing language form). The functionality provided by Facebook may also 
have helped users fall into specific social roles; for example, users only participating in a cooperative group 
of others may leverage the ability to choose what friendships to accept on Facebook, so that the group they are 
embedded in is cohesive. The idea that users can only interact with others in a limited number of ways is a 
unique property of online social systems compared to offline ones. Thus, advanced features used to discover 
social roles may also need to be associated with the different functionalities of an online social service, 
with values that reflect what functions are used and how frequently. Such features that are found to be 
‘significant’ across classes of users falling under the same social role may signal an association between the 
functionality of a social system and its social roles. 

6 Concluding Remarks 

This article presented a methodology to discover social roles in large scale social systems. The data-driven 
approach, rooted in the representation of ego-networks as a conditional triad census and implemented with a 
simple unsupervised learner, was applied to two different online social systems. Structural analysis of the 
ego-networks falling closest to the center of clusters of users with similar conditional triad censuses 
suggested the presence of users on Facebook that exclusively manage disconnected social circles or participate 
in a highly collaborative singular one. It also found how content posted on Wikipedia may attract either the 
attention of a number of domain experts, or of multiple generalist editors. The data-driven approach was 
motivated by a comparative analysis of the existing equivalence based, implied, and data-driven role discovery 
methods that had been proposed. The article concluded by suggesting the integration of social theories to 
derive features for role mining, and approaches to link together the notion of what a user can do on a social 
system with her social role on it. Future work should explore these opportunities, and may also consider 
unsupervised learners that allow users to fall into multiple role assignments. It is hoped that this important 
topic will continue to gain more attention in the computational social network analysis and mining community. 